A Whole Sentence Maximum Entropy Language Model
Author
Abstract
We introduce a new kind of language model, which models whole sentences or utterances directly using the Maximum Entropy paradigm. The new model is conceptually simpler, and more naturally suited to modeling whole-sentence phenomena, than the conditional ME models proposed to date. By avoiding the chain rule, the model treats each sentence or utterance as a "bag of features", where features are arbitrary computable properties of the sentence. The model is unnormalizable, but this does not interfere with training (done via sampling) or with use. Using the model is computationally straightforward. The main computational cost of training the model is in generating sample sentences from a Gibbs distribution. Interestingly, this cost has different dependencies than in the comparable conditional ME model, and is potentially lower.

1 Motivation

Conventional statistical language models estimate the probability of a sentence $s$ by using the chain rule to decompose it into a product of conditional probabilities:

$$\Pr(s) \stackrel{\text{def}}{=} \Pr(w_1 \ldots w_n) = \prod_{i=1}^{n} \Pr(w_i \mid w_1 \ldots w_{i-1}) \stackrel{\text{def}}{=} \prod_{i=1}^{n} \Pr(w_i \mid h) \tag{1}$$

where $h \stackrel{\text{def}}{=} \{w_1, \ldots, w_{i-1}\}$ is the history when predicting word $w_i$. The vast majority of work in statistical language modeling is then devoted to estimating terms of the form $\Pr(w \mid h)$. The application of the chain rule is technically harmless since it uses an exact equality, not an approximation. However, terms like $\Pr(w \mid h)$ may not be the best way to think about estimating $\Pr(s)$:

1. Global sentence information such as grammaticality is awkward to encode in a conditional framework.
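The contrast between the two views can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the bigram table, the feature names, and the weights are hypothetical, the history `h` is truncated to the previous word for brevity, and the exponential form `exp(sum_i lambda_i * f_i(s))` is the usual shape of a Gibbs distribution rather than a formula quoted from the abstract. The first function follows the chain-rule decomposition in Equation (1); the second scores a whole sentence as a bag of features and leaves the result unnormalized.

```python
import math

# Hypothetical toy bigram table: Pr(w | previous word); "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("cat", "sleeps"): 0.1,
}


def chain_rule_log_prob(words):
    """log Pr(s) as a sum of log Pr(w_i | h), with h cut down to the previous word."""
    log_p = 0.0
    prev = "<s>"
    for w in words:
        log_p += math.log(BIGRAM[(prev, w)])
        prev = w
    return log_p


# Hypothetical whole-sentence features: any computable property of the sentence.
FEATURES = {
    "has_verb": lambda ws: 1.0 if "sleeps" in ws else 0.0,
    "length_le_5": lambda ws: 1.0 if len(ws) <= 5 else 0.0,
}

# Hypothetical weights (the lambda_i); in practice these would be trained.
WEIGHTS = {"has_verb": 0.7, "length_le_5": 0.3}


def unnormalized_log_score(words):
    """Weighted sum of feature values; exp() of this is an unnormalized score."""
    return sum(WEIGHTS[name] * f(words) for name, f in FEATURES.items())


sentence = ["the", "cat", "sleeps"]
print(chain_rule_log_prob(sentence))
print(unnormalized_log_score(sentence))
```

The whole-sentence score never divides by a normalizing constant, which is enough when the model is only used to compare or rank candidate sentences.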
Similar resources
Improvement of a Whole Sentence Maximum Entropy Language Model Using Grammatical Features
In this paper, we propose adding long-term grammatical information to a Whole Sentence Maximum Entropy Language Model (WSME) in order to improve the performance of the model. The grammatical information was obtained from a Stochastic Context-Free Grammar and added to the WSME model as features. Finally, experiments using a part of the Penn Treebank corpus were carried out and significant i...
Efficient sampling and feature selection in whole sentence maximum entropy language models
Conditional Maximum Entropy models have been successfully applied to estimating language model probabilities of the form $\Pr(w \mid h)$, but are often too demanding computationally. Furthermore, the conditional framework does not lend itself to expressing global sentential phenomena. We have recently introduced a non-conditional Maximum Entropy language model which directly models the probability of an entir...
Fast parameter estimation for joint maximum entropy language models
This paper discusses efficient parameter estimation methods for joint (unconditional) maximum entropy language models such as whole-sentence models. Such models are a sound framework for formalizing arbitrary linguistic knowledge in a consistent manner. It has been shown that general-purpose gradient-based optimization methods are among the most efficient algorithms for estimating parameters of...
Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model
The Maximum Entropy principle (ME) is an appropriate framework for combining information of a diverse nature from several sources into the same language model. In order to incorporate long-distance information into the ME framework in a language model, a Whole Sentence Maximum Entropy Language Model (WSME) could be used. Until now, Monte Carlo Markov Chain (MCMC) sampling techniques have been use...
Discriminative maximum entropy language model for speech recognition
This paper presents a new discriminative language model based on the whole-sentence maximum entropy (ME) framework. In the proposed discriminative ME (DME) model, we exploit an integrated linguistic and acoustic model, which properly incorporates the features from n-gram model and acoustic log likelihoods of target and competing models. Through the constrained optimization of integrated model, ...
Journal title:
Volume, issue
Pages -
Publication date: 1997